ABSTRACT

A strategy for overcoming problems with the Rasch model's inability
to handle missing data involves a pairwise algorithm which manipulates
the data matrix to separate out the information needed for the
estimation of item difficulty parameters in a test. The method of
estimation compares two or three items at a time, separating out the
ability parameters of the set of persons tested by means of
conditional probability, to avoid biasing the difficulty parameters.
To describe the difficulty of a set of items, a matrix is constructed
in which each element is the number of people who responded correctly
to one item and incorrectly to another item. The matrix of
observations need not be complete. An analogous matrix is prepared to
describe the relative abilities of the persons, by considering them
two at a time and looking only at those items which one got right and
the other got wrong. Maximum likelihood estimation and tests for fit
of the parameter estimates are examined. (CM)

The Nature of the Problem

Rasch (1960), in a book on stochastic item response models, set
out a "structural model for test items" which subsequently came to
bear his name. In this book, Rasch discussed the model's basic
assumptions in some detail and began to explore its mathematics.
Updating his notation somewhat to conform with current usage, the
Rasch model can more simply be written:

$$\text{Prob}\,[X_{vi} = 1] = \frac{e^{(\alpha_v - \delta_i)}}{1 + e^{(\alpha_v - \delta_i)}}$$

where $X_{vi}$, the outcome of person v attempting item i, is one if the
response is correct, zero otherwise. $\alpha_v$ is a parameter describing
the ability of person v and $\delta_i$ is a parameter describing the
difficulty of item i.
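
As a minimal sketch of the model just stated, the function below evaluates the probability of a correct response for a given ability and difficulty. The function name `rasch_prob` and its argument names are illustrative assumptions, not part of the original text.

```python
import math


def rasch_prob(alpha: float, delta: float) -> float:
    """Probability that a person of ability alpha answers an item of
    difficulty delta correctly, under the Rasch model.

    Written as 1 / (1 + exp(-(alpha - delta))), which is algebraically
    the same as exp(alpha - delta) / (1 + exp(alpha - delta)).
    """
    return 1.0 / (1.0 + math.exp(-(alpha - delta)))


# A person whose ability equals the item difficulty has an even chance.
p_matched = rasch_prob(0.0, 0.0)
# A more able person has a better than even chance on the same item.
p_able = rasch_prob(1.0, 0.0)
```

Only the difference between ability and difficulty enters the calculation, which is why adding a constant to both leaves the probability unchanged.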

In most applications, use of the model involves using a large
number of observed X values, often arranged in a persons-by-items
matrix, to estimate the values of $\alpha$ for the set of people being
tested, and $\delta$ for the items in the test. In the 1960 book, Rasch
makes only hesitant steps towards procedures for estimating $\alpha$ and
$\delta$ because of limitations in the computational facilities available
to him, and his own preference for simple, graphical methods and
intuition. However, he did sketch out an analytic procedure (pp.
178-181) for obtaining maximum likelihood estimates of both the
$\alpha$'s and the $\delta$'s. Unfortunately, this procedure depended on "a
mastery of the coefficients that is not yet at the disposal of
the author" in which the persons-by-items matrix is analyzed. These
coefficients represent the number of different but possible patterns
of ones and zeros in the matrix that would yield the observed marginal
values. Rasch offered some formulae for calculating the coefficients
based on summing elementary symmetric functions, and the method was
successfully demonstrated by Wright as early as 1965. However, the
number of calculations required to determine the elementary symmetric
functions increases steeply with k, where k is the number of items in
the test. Wright points out that this makes the method prohibitively
expensive even with the speed and capacity of modern computers, and
also inaccurate because round-off errors accumulate during the
calculations. In practice this estimation procedure was limited to
tests of not more than about ten items. More recently Gustafsson
(1977) has reprogrammed the algorithm such that it can handle up to
60 or 70 items satisfactorily. It still remains, however, very
expensive when compared to other approaches.

From 1969 on, Wright and various associates at Chicago developed
a streamlined procedure based only on the marginals of the observation
matrix. The X values, based on the item responses, that led to the
marginals were used only for checking the fit of the data to the
model. The preferred statistical method was again that of maximum
likelihood. While initial estimates of the $\delta$'s are held constant,
the $\alpha$'s are adjusted to maximize the likelihood function for the
marginals. Then the $\alpha$ values are held fixed while the $\delta$'s are
adjusted. The cycle repeats until convergence is achieved (Wright,
Mead & Bell, 1979). However, it was pointed out that this approach
(dubbed UCON) produces biased estimates, because the ability
parameters and their errors of estimate have not been conditioned out
of the item difficulty calibration (and vice versa) as they had been
in the Rasch procedure described above. Wright and Douglas (1977)
proposed a simple correction factor which effectively removed most of
the bias, and the result was a fast and efficient estimation algorithm
that could accurately recover the parameters used to generate
artificial data. The algorithm yields standard errors for all the
parameter estimates as byproducts of the calculation, and lends itself
to several tests of fit of the data to the model. For the last
decade, it has been the method used in most Rasch scaling exercises
throughout the United States and in a number of other countries.

In educational applications this algorithm's main shortcoming
is its inability to handle missing data in an appropriate fashion.
More specifically, the algorithm requires a complete rectangular
persons-by-items matrix in which each element is a one (representing a
correct response) or a zero (representing an incorrect response).
In practice this causes problems:

(a) when, as in survey designs based on matrix sampling or in
the use of an item bank, it is deliberately planned that
different students should attempt different items,

(b) when the intention is to collect a complete set of data, but
for various reasons (e.g., incorrectly assembled tests,
student illness, errors in coding or data processing) gaps
occur in the set of data prepared for statistical analysis,

(c) when the collected data set is complete but it is desired to
edit it selectively (e.g., to remove obvious guesses) before
the analysis is carried out.

This paper is directed towards a strategy for overcoming these
problems.

The Separation of Ability and Difficulty Parameters

In his 1960 text, Rasch also described a method of estimation
based on the comparison of two or three items at a time (p. 17^-174).
He gave credit for discovery of the algorithm to G. Leunbach, the Head
of the Statistical Unit in the Danish Institute for Educational
Research. The main thrust of this algorithm is the manipulation of
the data matrix in order to separate out the information needed for
the estimation of the item difficulty parameters $\delta$. Conditioning
out the ability parameters $\alpha$ in this way avoids the biasing of the
parameters described above. In fact, this procedure corresponds
closely to conventional practice in the natural sciences: the
calibration of instruments, independent of the objects to which they
are eventually to be applied, precedes their use for measurement.

The algebraic presentation of this "pairwise" algorithm has been
updated to conform to current notation, but it follows the logical
sequence used by Rasch.

The basic model we shall use is

$$\text{Prob}\,[X_{vi} = 1] = \frac{e^{(\alpha_v - \delta_i)}}{1 + e^{(\alpha_v - \delta_i)}} \tag{1}$$

where $X_{vi}$, $\alpha_v$ and $\delta_i$ are as defined above.

For many purposes it is simpler to rewrite equation 1 as

$$\text{Prob}\,[X_{vi} = 1] = \frac{e^{\alpha_v}}{e^{\alpha_v} + e^{\delta_i}}$$

$$\text{Prob}\,[X_{vi} = 0] = \frac{e^{\delta_i}}{e^{\alpha_v} + e^{\delta_i}}$$

which leads to the "odds" (i.e., P/(1-P)) of a correct response:

$$\text{Odds}\,[X_{vi} = 1] = e^{(\alpha_v - \delta_i)}$$

Consider now the possible outcomes when person v attempts two
items i and j. Note that the local independence assumption of the
Rasch model requires the responses to the two items to be independent.
Four separate cases need to be considered.



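Case (i) - both items correct

$$\text{Prob}\,[a_{vi} = 1,\ a_{vj} = 1] = \frac{e^{\alpha_v}\, e^{\alpha_v}}{(e^{\alpha_v} + e^{\delta_i})(e^{\alpha_v} + e^{\delta_j})}$$

Case (ii) - both items incorrect

$$\text{Prob}\,[a_{vi} = 0,\ a_{vj} = 0] = \frac{e^{\delta_i}\, e^{\delta_j}}{(e^{\alpha_v} + e^{\delta_i})(e^{\alpha_v} + e^{\delta_j})}$$

Case (iii) - item i correct; item j incorrect

$$\text{Prob}\,[a_{vi} = 1,\ a_{vj} = 0] = \frac{e^{\alpha_v}\, e^{\delta_j}}{(e^{\alpha_v} + e^{\delta_i})(e^{\alpha_v} + e^{\delta_j})}$$

Case (iv) - item i incorrect; item j correct

$$\text{Prob}\,[a_{vi} = 0,\ a_{vj} = 1] = \frac{e^{\delta_i}\, e^{\alpha_v}}{(e^{\alpha_v} + e^{\delta_i})(e^{\alpha_v} + e^{\delta_j})}$$

The first two cases hold little interest. A moment's reflection
reveals that the information they provide about the ability of person
v is distinctly limited, and they provide no information at all about
the relative difficulties of items i and j.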
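Cases (iii) and (iv) are somewhat different. If attention is
restricted to these two, for both of which

$$a_{vi} + a_{vj} = 1$$

then we can write

$$\text{Prob}\,[a_{vi} + a_{vj} = 1] = \frac{e^{\alpha_v} e^{\delta_j} + e^{\delta_i} e^{\alpha_v}}{(e^{\alpha_v} + e^{\delta_i})(e^{\alpha_v} + e^{\delta_j})} = \frac{e^{\alpha_v}\,(e^{\delta_i} + e^{\delta_j})}{(e^{\alpha_v} + e^{\delta_i})(e^{\alpha_v} + e^{\delta_j})}$$

If, therefore, we know that person v scored exactly one on the
item pair (i, j), then we can write conditional probabilities:

$$\text{Prob}\,[a_{vi} = 1 \mid a_{vi} + a_{vj} = 1] = \frac{e^{\delta_j}}{e^{\delta_i} + e^{\delta_j}}$$

and similarly

$$\text{Prob}\,[a_{vj} = 1 \mid a_{vi} + a_{vj} = 1] = \frac{e^{\delta_i}}{e^{\delta_i} + e^{\delta_j}}$$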
The ability parameter $\alpha_v$ has been eliminated entirely from
these two expressions. If we know that an individual scores just one
on any item pair, the probability that it was one rather than the
other that was answered correctly depends solely on the relative
difficulty of the two items.
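
The cancellation of the ability parameter can be checked numerically. The sketch below is an illustration under the model as stated above; the helper names `p_correct` and `cond_prob_i_given_one` are assumptions introduced here. It computes the conditional probability from the joint probabilities of the two mixed cases and compares it with the closed form, for several abilities.

```python
import math


def p_correct(alpha: float, delta: float) -> float:
    """Prob[response = 1] written as e^alpha / (e^alpha + e^delta)."""
    return math.exp(alpha) / (math.exp(alpha) + math.exp(delta))


def cond_prob_i_given_one(alpha: float, delta_i: float, delta_j: float) -> float:
    """Prob[item i correct | exactly one of items i, j correct],
    built from the joint probabilities under local independence."""
    p_i = p_correct(alpha, delta_i)
    p_j = p_correct(alpha, delta_j)
    i_right_j_wrong = p_i * (1.0 - p_j)
    i_wrong_j_right = (1.0 - p_i) * p_j
    return i_right_j_wrong / (i_right_j_wrong + i_wrong_j_right)


delta_i, delta_j = -0.5, 1.2
closed_form = math.exp(delta_j) / (math.exp(delta_i) + math.exp(delta_j))
# The same value is obtained whatever ability is supplied.
values = [cond_prob_i_given_one(a, delta_i, delta_j) for a in (-2.0, 0.0, 3.0)]
```

Each entry of `values` agrees with `closed_form` to rounding error, which is the separation property in numerical form.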

The fundamental importance of this separation of the $\alpha$ and $\delta$
parameter sets (by means of conditional probability) to the whole
process of measurement has been eloquently described by Rasch (1977).
Even when the method is not used in parameter estimation, it is the
fact that the model permits the separation that qualifies it for
membership in the class of "specifically objective" measurement
models, meeting the criteria drawn up by Thurstone as long ago as
1928.

The probability of having a correct response to item i, given
that of the two responses to items i and j one is right and one is
wrong, can be estimated by observing the results of a large number of
people who attempt these two items. If we define $b_{ij}$ to be the number
of people who respond correctly to i and incorrectly to j (with $b_{ji}$
similarly defined), then we can write

$$\frac{b_{ij}}{b_{ij} + b_{ji}}$$

as an estimate of $\text{Prob}\,[a_{vi} = 1 \mid a_{vi} + a_{vj} = 1]$,
since this conditional probability does not depend on the $\alpha_v$ and is
the same for all people in the group. I.e.,

$$\frac{b_{ij}}{b_{ij} + b_{ji}} \ \text{estimates}\ \frac{e^{\delta_j}}{e^{\delta_i} + e^{\delta_j}}$$

or

$$\frac{b_{ij}}{b_{ji}} \ \text{estimates}\ e^{(\delta_j - \delta_i)}$$

which is to say that
$(\delta_j - \delta_i)$ is estimated by $\log b_{ij} - \log b_{ji}$.

For every pair of items in a test, we can calculate the values of
$b_{ij}$ and $b_{ji}$ and hence obtain an estimate of the relative
difficulty of the two items concerned. This is more than sufficient
information to estimate the relative difficulties of all the items.

The UCON approach described at the beginning makes progress by
summarizing the original matrix of 1's and 0's into a (k+1) by k
matrix, where for each of (k+1) possible raw scores, the number of
correct responses to each of k items is recorded. By contrast, the
PAIR approach works by summarizing the original matrix into a k by k
matrix of $b_{ij}$ values. However, as can be seen in Figure 1, each
summary matrix contains only k(k-1) useful values for the estimation
procedure since two rows in the UCON summary matrix have fixed values,
and the leading diagonal entries of the PAIR summary are always empty.

Figure 1: Data Reduction Strategies for Rasch Parameter Estimation
1a: Score-group Summarization (cell entries: number of correct responses to item i)

There is of course an analogous matrix of counts describing the
relative abilities of the persons, by considering them two at a time
and looking only at those items which one got right and the other got
wrong. This effectively eliminates $\delta$ from the data, and produces a
summary matrix with information about the persons. In practice this
has been little explored for two reasons. First, there are typically
more persons than items in a set of test data, so the person-person
summary produces a larger matrix with smaller cell entries. Second,
since the ultimate goal is usually to measure the persons individually
in terms of their performance on the items, it seems logical first to
process the data in order to obtain the best possible calibration of
the items, and then to apply these calibrations to the measurement of
the persons. Nevertheless the comparison of the measurements obtained
by the different methods holds considerable theoretical interest and
deserves detailed investigation in the near future. In this paper,
however, I shall concentrate on the prior calibration of the items
before any measurement of persons is attempted.

Estimating the Difficulty Parameters

To calibrate a set of items from a matrix of observations, a
complete matrix B is constructed with elements $b_{ij}$ as defined above.
Note that the matrix of observations need not be complete. An
individual who is exposed to items i and j gets an opportunity to
contribute to $b_{ij}$ or $b_{ji}$, and thus to the estimation of $\delta_i$ and
$\delta_j$. It is not necessary for this individual to attempt all (or
indeed any) of the other items in the set. This is the algorithm's
great strength in practical applications.
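
The construction of B from an incomplete observation matrix can be sketched as follows. This is an illustration only: the function name `pairwise_counts` and the convention of marking an unattempted item with `None` are assumptions made here, not details given in the paper.

```python
from typing import List, Optional


def pairwise_counts(responses: List[List[Optional[int]]]) -> List[List[int]]:
    """Build the k-by-k summary matrix B, where B[i][j] is the number of
    persons who answered item i correctly (1) and item j incorrectly (0).

    A response of None marks an item the person did not attempt; such a
    response simply contributes to no cell.
    """
    k = len(responses[0])
    b = [[0] * k for _ in range(k)]
    for row in responses:
        for i in range(k):
            if row[i] != 1:          # item i missing or wrong: nothing to add
                continue
            for j in range(k):
                if row[j] == 0:      # item j attempted and wrong
                    b[i][j] += 1
    return b


# Four persons, three items, with gaps in the data.
data = [
    [1, 0, None],
    [0, 1, 1],
    [1, None, 0],
    [None, 1, 0],
]
B = pairwise_counts(data)
```

Every person contributes through whichever item pairs they actually attempted, so no row has to be discarded because of a gap.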



The practical solution to the estimation task of item parameters
with the pairwise approach, described by Rasch (1960) and first
demonstrated by Choppin (1965) at a meeting of the Midwestern
Psychological Association, amounted to taking the logarithms of the
off-diagonal elements in the B matrix and summing them to get row and
column marginals. The difference between the sum of the ith column
and the sum of the ith row (omitting the diagonal term) is

$$S_i = \log(b_{1i}) + \log(b_{2i}) + \log(b_{3i}) + \dots + \log(b_{ki}) - [\log(b_{i1}) + \log(b_{i2}) + \log(b_{i3}) + \dots + \log(b_{ik})]$$

which estimates

$$\sum_{j \ne i} (\delta_i - \delta_j) \quad \text{or} \quad k\delta_i - D$$

where D is the sum of $\delta_j$ over all j.
Note that the model, and equation 7 which we have derived from it, has
nothing to say about the absolute value of the parameters, only their
relative magnitude. If we have a set of $\alpha$'s and $\delta$'s which satisfy
equation 1, then the new sets produced by adding a constant to all the
old values will also satisfy the equation. No "absolute" zero is
defined on the ability or difficulty scale, and it has become
conventional to arbitrarily fix the mean $\delta$-value for a particular set
of items at 0, since this simplifies the algebra. Parameters can
later be adjusted to other zero points, and indeed other units, if so
desired.

By taking D = 0, we have the estimation equation

$$S_i = k\delta_i \quad \text{or} \quad \delta_i = \frac{S_i}{k}$$
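
A minimal sketch of this logarithmic estimate follows, assuming B is held as a list of lists with strictly positive off-diagonal entries. The function name `log_marginal_estimates` is hypothetical. The example matrix is artificial: its entries are set to their model-expected proportions so that the known difficulties should be recovered.

```python
import math
from typing import List


def log_marginal_estimates(b: List[List[float]]) -> List[float]:
    """Estimate delta_i as S_i / k, where S_i is the sum of the logs in
    column i minus the sum of the logs in row i (diagonal omitted).

    math.log raises ValueError if any off-diagonal entry is zero.
    """
    k = len(b)
    estimates = []
    for i in range(k):
        column_sum = sum(math.log(b[j][i]) for j in range(k) if j != i)
        row_sum = sum(math.log(b[i][j]) for j in range(k) if j != i)
        estimates.append((column_sum - row_sum) / k)
    return estimates


# Artificial B whose entries follow the model exactly (difficulties sum to 0).
true_delta = [-1.0, 0.0, 1.0]
n_pair = 100.0
B_exact = [
    [0.0 if i == j else
     n_pair * math.exp(true_delta[j]) / (math.exp(true_delta[i]) + math.exp(true_delta[j]))
     for j in range(3)]
    for i in range(3)
]
delta_hat = log_marginal_estimates(B_exact)
```

With real counts the recovery is only approximate, and a single zero cell stops the calculation altogether.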

This approach, which is simple and effective when the values of
$b_{ij}$ are large and fairly homogeneous, breaks down completely when one
or more values of $b_{ij}$ ($i \ne j$) are zero, since the logarithmic
function is not then defined. This appeared to be a major stumbling
block, since zero values in the B matrix are met quite often in
practice, and led to the approach almost being abandoned.

However, Choppin (1982) pointed out that Rasch's discussion of
item triplets (Rasch, 1960, p. 173) can be extended.

If $e^{(\delta_i - \delta_j)}$ can be estimated by $b_{ji}/b_{ij}$, then it can also be
estimated by

$$\frac{b_{jk}\, b_{ki}}{b_{kj}\, b_{ik}}$$

for any third item k, since this product estimates
$e^{(\delta_k - \delta_j)}\, e^{(\delta_i - \delta_k)} = e^{(\delta_i - \delta_j)}$.

A better estimate yet is obtained by pooling information across
items (i.e., by summing over the double subscript in both top and
bottom of the expression). This gives

$$\frac{b^*_{ji}}{b^*_{ij}} = \frac{\sum_{k} b_{jk}\, b_{ki}}{\sum_{k} b_{ik}\, b_{kj}}$$

where $b^*_{ij}$ are the elements of B*, the square of the original summary
matrix B.

In general B* will not contain off-diagonal zero elements, unless
the items are inadequately linked in the sample design (when the
complete simultaneous estimation of their difficulties is
impossible). Squaring the original B matrix is thus a way of avoiding
the problem of zero entries, and leads quickly to a set of
$\delta$ estimates.
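
The effect of squaring can be illustrated with a small invented matrix that has one empty off-diagonal cell. The helper name `square_matrix` and the numbers in `B_sparse` are assumptions made for the illustration; they are not data from the paper.

```python
from typing import List


def square_matrix(b: List[List[float]]) -> List[List[float]]:
    """Return B* = B x B, so that B*[i][j] is the sum over m of
    B[i][m] * B[m][j]."""
    k = len(b)
    return [
        [sum(b[i][m] * b[m][j] for m in range(k)) for j in range(k)]
        for i in range(k)
    ]


# Invented 4-item summary matrix; the cell for (item 3 right, item 2 wrong)
# is empty, so the direct log-ratio for that pair is undefined.
B_sparse = [
    [0, 3, 2, 5],
    [1, 0, 4, 2],
    [2, 0, 0, 3],
    [1, 2, 1, 0],
]
B_star = square_matrix(B_sparse)
```

In this example the empty cell at row 3, column 2 (zero-based `[2][1]`) is filled in B* through the links provided by the other two items.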

A major drawback, however, is that the manipulation demonstrated
in the preceding two paragraphs is only valid for data sets that "fit"
the model. In practice some misfit may be expected to occur, and it
is important to know about it. The method outlined above will produce
$\delta$ estimates from virtually any set of test data, and experience
suggests that the B* matrix is closer to the structure prescribed by
the model than the original B.

In general, this approach to parameter estimation (which
corresponds to a least squares procedure) is not recommended except
where strong a priori evidence suggests that data will conform well to
the requirements of the model.

Maximum Likelihood Estimation

A more satisfactory method of unravelling the information stored
in the B matrix is that of maximum likelihood. Suppose that in matrix
B, $n_{ij}$ individuals score exactly one on the item pair (i, j), so that
we can write:

$$n_{ij} = b_{ij} + b_{ji}$$

Now for any individual in this group, the probability that he
gets item i right and item j wrong is:

$$\frac{e^{\delta_j}}{e^{\delta_i} + e^{\delta_j}}$$

and conversely, the probability that i is wrong and j is right is:

$$\frac{e^{\delta_i}}{e^{\delta_i} + e^{\delta_j}}$$

From the binomial theorem the probability of the $n_{ij}$ individuals
dividing into exactly $b_{ij}$ and $b_{ji}$ subgroups is:

$$\frac{n_{ij}!}{b_{ij}!\, b_{ji}!} \cdot \frac{e^{(b_{ij}\delta_j + b_{ji}\delta_i)}}{(e^{\delta_i} + e^{\delta_j})^{n_{ij}}}$$

and the likelihood of the entire B matrix, given a matrix N of $n_{ij}$
elements, is:

$$\prod_{i<j} \frac{n_{ij}!}{b_{ij}!\, b_{ji}!} \cdot \frac{e^{(b_{ij}\delta_j + b_{ji}\delta_i)}}{(e^{\delta_i} + e^{\delta_j})^{n_{ij}}}$$

The problem now is to find a set of $\delta_i$'s which maximize the
value of this function. This maximum and the maximum of its logarithm
occur at the same point, so for simplicity we may write the log
likelihood:

$$L = C + \sum_{i<j} (b_{ij}\delta_j + b_{ji}\delta_i) - \sum_{i<j} (b_{ij} + b_{ji}) \log(e^{\delta_i} + e^{\delta_j})$$

where C is a function of the b's but not of the $\delta$'s.
For maximum likelihood $\partial L / \partial \delta_i = 0$ for each i:

$$\sum_{j \ne i} \left[ \frac{(b_{ij} + b_{ji})\, e^{\delta_i}}{e^{\delta_i} + e^{\delta_j}} - b_{ji} \right] = 0 \quad (\text{all } i)$$

or

$$\sum_{j \ne i} b_{ji} = \sum_{j \ne i} \frac{(b_{ij} + b_{ji})\, e^{\delta_i}}{e^{\delta_i} + e^{\delta_j}} \quad (\text{all } i)$$

As before, it is necessary to insert an additional linear
constraint on the set of $\delta$'s since it is clear that adding any
constant value to all the $\delta$'s in the above equations would not
change the nature of the solution. Rasch scaling deals in relative
difficulty rather than absolute difficulty and no zero point on the
scale can be uniquely defined. However, once the $\delta$-value for one
item has been fixed (if necessarily arbitrarily) then all the other
$\delta$'s can be defined by relating them to the first. The usual
constraint is to put the sum of the $\delta$-values equal to zero.

This set of k equations in k unknowns can be solved by various
iterative techniques. Two that have been found to work well in
practice are:



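$$\delta_i^{(n+1)} = \log \left[ \frac{\sum_{j \ne i} b_{ji}}{\sum_{j \ne i} (b_{ij} + b_{ji}) \,/\, (e^{\delta_i^{(n)}} + e^{\delta_j^{(n)}})} \right]$$

and the Newton-Raphson procedure:

$$\delta_i^{(n+1)} = \delta_i^{(n)} - \frac{\sum_{j \ne i} \dfrac{(b_{ij} + b_{ji})\, e^{\delta_i}}{e^{\delta_i} + e^{\delta_j}} - \sum_{j \ne i} b_{ji}}{\sum_{j \ne i} \dfrac{(b_{ij} + b_{ji})\, e^{\delta_i} e^{\delta_j}}{(e^{\delta_i} + e^{\delta_j})^2}}$$

with the $\delta$'s on the right hand side taken at their values from iteration n.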

In general an efficient procedure has proved to be to set the
initial $\delta^{(0)}$-values all equal to zero, and then to apply the first
iterative set of equations three or four times. This produces a set
of reasonably good approximations which, when used with the
Newton-Raphson equations, leads rapidly to convergence and a solution
for the various $\delta$'s.
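
The sketch below solves the likelihood equations by a simple fixed-point iteration obtained by rearranging them for $e^{\delta_i}$, starting from zero and recentring after each pass so that the difficulties sum to zero. It is offered as one workable scheme under stated assumptions, not as a reproduction of the paper's exact iterative formulas: the function name `pairwise_ml`, the fixed iteration count, and the omission of a Newton-Raphson finishing stage are all choices made here for brevity. Every item needs at least one non-zero entry in its column of B.

```python
import math
from typing import List


def pairwise_ml(b: List[List[float]], iterations: int = 200) -> List[float]:
    """Maximum likelihood item difficulties from the pairwise matrix B.

    Each pass applies, for every item i,
        delta_i <- log( sum_j b[j][i] / sum_j n_ij / (e^delta_i + e^delta_j) )
    with n_ij = b[i][j] + b[j][i], then recentres to mean zero.
    """
    k = len(b)
    delta = [0.0] * k                      # initial values all zero
    for _ in range(iterations):
        updated = []
        for i in range(k):
            numerator = sum(b[j][i] for j in range(k) if j != i)
            denominator = sum(
                (b[i][j] + b[j][i]) / (math.exp(delta[i]) + math.exp(delta[j]))
                for j in range(k) if j != i
            )
            updated.append(math.log(numerator / denominator))
        mean = sum(updated) / k
        delta = [d - mean for d in updated]  # impose sum of deltas = 0
    return delta


# Artificial B with entries equal to their model expectations.
known = [-1.0, 0.0, 1.0]
B_model = [
    [0.0 if i == j else
     100.0 * math.exp(known[j]) / (math.exp(known[i]) + math.exp(known[j]))
     for j in range(3)]
    for i in range(3)
]
ml_delta = pairwise_ml(B_model)
```

Unlike the logarithmic row and column method, this scheme tolerates individual zero cells in B, provided the items remain linked through other pairs.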

A great advantage of the maximum likelihood procedure is that it
can be used to generate standard errors for the estimated parameters
(Kendall & Stuart, 1969). The variance-covariance matrix V is the
inverse of a matrix whose elements are:

$$-\frac{\partial^2 L}{\partial \delta_i \, \partial \delta_j}$$

evaluated at the maximum likelihood solution. In practice, however, a
simpler approximation



seems adequate, and is recommended for routine use.

Estimating the Ability Parameters

If $X_{vi}$ (i = 1, ..., k) are the set of scored responses for person v,
whose total test score is $r_v$, and whose ability parameter is $\alpha_v$, then
we may write:

$$r_v = \sum_i X_{vi}$$

where the summation runs over the set of items attempted by person v.

The likelihood of the response $X_{vi}$ according to the model is:

$$\frac{e^{X_{vi}(\alpha_v - \delta_i)}}{1 + e^{(\alpha_v - \delta_i)}}$$

since $X_{vi}$ takes values one and zero accordingly as the response is
right or wrong.

From this, the likelihood function for the entire set of
responses ($X_{vi}$) for person v is:

$$\prod_i \frac{e^{X_{vi}(\alpha_v - \delta_i)}}{1 + e^{(\alpha_v - \delta_i)}}$$

The logarithm of this function:

$$L = \sum_i X_{vi}(\alpha_v - \delta_i) - \sum_i \log\left(1 + e^{(\alpha_v - \delta_i)}\right) = r_v \alpha_v - \sum_i X_{vi}\delta_i - \sum_i \log\left(1 + e^{(\alpha_v - \delta_i)}\right)$$

For the ML solution, $dL/d\alpha_v = 0$. It should be noted that in this
case the $\delta$'s are regarded as already known, so that $\alpha_v$ is the
only parameter to be estimated.

$$0 = \frac{dL}{d\alpha_v} = r_v - \sum_i \frac{e^{(\alpha_v - \delta_i)}}{1 + e^{(\alpha_v - \delta_i)}}$$

This equation does not contain the item responses ($X_{vi}$). It
demonstrates a result, already obtained by other writers, that the
ability estimate depends not upon the particular pattern of item
responses obtained, but only upon the "total score." r is a
sufficient statistic for ability, and the conventional practice of
using total scores as measures has a logical foundation.

The equation

$$r_v = \sum_i \frac{e^{(\alpha_v - \delta_i)}}{1 + e^{(\alpha_v - \delta_i)}}$$

can be solved for $\alpha_v$.



The score on the test takes values 0, 1, 2, ... k. Each of
the k terms on the right hand side of the equation lies between 0 and
1 for real values of $\alpha_v$ and $\delta_i$. Note that there are no solutions
for r = 0 and r = k. For these values the likelihood function has
no maximum, and this could have been anticipated. If an individual
responds correctly to every item (i.e., $r_v$ = k), then we have no
information on which to base any upper bound for an ability estimate.
Similarly, if every item is answered incorrectly, there are no data to
suggest just how low the level of ability might be.

Note that once a set of items has been calibrated (i.e., the
$\delta$'s have been estimated), it is possible to estimate an ability
parameter for each possible score on the test, regardless of whether
or not any individual actually obtains such a score. If a test is
constructed by selecting items from an already calibrated item bank,
then ability parameters for all possible scores on the new test can be
calculated even before the test is used.
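
A score-to-ability table of this kind can be sketched as below. The paper does not say how the score equation is solved, so the use of Newton-Raphson, the starting value log(r/(k-r)), and the names `ability_for_score` and `score_to_ability_table` are assumptions made for the illustration.

```python
import math
from typing import Dict, List


def ability_for_score(r: int, deltas: List[float],
                      tol: float = 1e-10, max_iter: int = 100) -> float:
    """Solve r = sum_i e^(alpha - delta_i) / (1 + e^(alpha - delta_i))
    for alpha by Newton-Raphson, given known item difficulties."""
    k = len(deltas)
    if not 0 < r < k:
        raise ValueError("no finite estimate exists for scores of 0 or k")
    alpha = math.log(r / (k - r))          # assumed starting value
    for _ in range(max_iter):
        probs = [1.0 / (1.0 + math.exp(-(alpha - d))) for d in deltas]
        expected_score = sum(probs)
        information = sum(p * (1.0 - p) for p in probs)
        step = (r - expected_score) / information
        alpha += step
        if abs(step) < tol:
            break
    return alpha


def score_to_ability_table(deltas: List[float]) -> Dict[int, float]:
    """Ability estimate for every raw score from 1 to k - 1."""
    return {r: ability_for_score(r, deltas) for r in range(1, len(deltas))}


table = score_to_ability_table([-1.0, 0.0, 1.0])
```

The table depends only on the item difficulties, so it can be prepared before any person has taken the test; scores of 0 and k are excluded because they have no finite estimate.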

The standard errors of the ability parameters, corresponding as
they do to the standard error of measurement, are usually of more
interest than the standard errors of the item difficulties.
Furthermore, they are typically considerably larger, since the ability
parameter estimates are based upon only k observations (usually
between 10 and 100) whereas item calibration is typically based upon
the results of at least several hundred individuals.

In general, if we assume that the $\delta$'s are established with some
precision, the standard errors of the $\alpha$'s can be developed from the
log likelihood function.

$$L = \alpha_v r_v - \sum_i X_{vi}\delta_i - \sum_i \log\left[1 + e^{(\alpha_v - \delta_i)}\right]$$

$$\frac{dL}{d\alpha_v} = r_v - \sum_i \frac{e^{(\alpha_v - \delta_i)}}{1 + e^{(\alpha_v - \delta_i)}}$$

$$\frac{d^2L}{d\alpha_v^2} = -\sum_i \frac{\left[1 + e^{(\alpha_v - \delta_i)}\right] e^{(\alpha_v - \delta_i)} - e^{2(\alpha_v - \delta_i)}}{\left[1 + e^{(\alpha_v - \delta_i)}\right]^2} = -\sum_i \frac{e^{(\alpha_v - \delta_i)}}{\left[1 + e^{(\alpha_v - \delta_i)}\right]^2}$$

Then the standard error of measurement for an individual who
receives an ability estimate $\alpha_v$ for his responses to items with
difficulties $\delta_i$ is:

$$\left[ -\frac{d^2L}{d\alpha_v^2} \right]^{-1/2}$$

The second differential reaches a maximum absolute value of k/4 when
all the $\delta$'s are equal to $\alpha_v$. In practice, if the $\delta$'s are all
fairly close to $\alpha_v$ (i.e., if the items are all closely matched to the
person's ability), then the second differential remains close to
k/4 in absolute value and so the standard error is given
approximately by $2/\sqrt{k}$ logits; in general, however, we can write

$$\text{S.E.}(\alpha_v) \ge \frac{2}{\sqrt{k}} \ \text{logits}$$

whatever the distribution of the $\delta$'s.
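
The bound can be checked numerically with the short sketch below, which evaluates the standard error as the inverse square root of the summed p(1-p) terms. The function name `ability_standard_error` and the example difficulties are illustrative assumptions.

```python
import math
from typing import List


def ability_standard_error(alpha: float, deltas: List[float]) -> float:
    """Standard error of an ability estimate: the inverse square root of
    sum_i p_i (1 - p_i), where p_i is the model probability of success
    on item i. This sum equals minus the second derivative of L."""
    information = 0.0
    for d in deltas:
        p = 1.0 / (1.0 + math.exp(-(alpha - d)))
        information += p * (1.0 - p)
    return 1.0 / math.sqrt(information)


# Sixteen items all matched to the person's ability: the bound is attained.
se_matched = ability_standard_error(0.0, [0.0] * 16)
# Five items spread around the ability: the error is larger than the bound.
se_spread = ability_standard_error(0.0, [-2.0, -1.0, 0.0, 1.0, 2.0])
bound_spread = 2.0 / math.sqrt(5)
```

Here `se_matched` equals 2 divided by the square root of 16, while `se_spread` exceeds `bound_spread`, in line with the inequality above.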

Tests of Fit

Control of the model, by validating its conformity to the
structure of a particular data set within pre-specifiable limits, has
been difficult to achieve with the PAIR method of parameter
estimation. The most frequently used approach has been the non-random
splitting of the original data set into two parts based on some
characteristic of the persons, calibrating the full set of items for
each part of the data separately, and plotting the results against one
another. This is inexpensive, straightforward, and has considerable
utility although it lacks mathematical elegance. A division of the
sample of persons into high performers and low performers at the
median raw score is the most severe test of the anticipated invariance
of item difficulty parameters. It focuses directly on the assumption
of equal discriminating power for all items in the set, and the
associated though contrary threat, ability-related random guessing.
Since the same set of item parameters is being estimated for both high
and low ability groups, and the mean difficulty in each case is being
fixed at zero, the model predicts that within the limits of sampling
error the same item calibrations should emerge from each half of the
analysis. Figure 2 demonstrates typical results from two
multiple-choice achievement tests, one of which fits the model very
well and one of which shows considerable evidence of guessing.
Experience with plots of this type shows that they can be very
informative, and Figure 3 shows a somewhat simplified guide to their
interpretation.

Of course other splits can be used to generate these plots. For
example, to test for the presence of sex bias within a test it is
possible to plot calibrations obtained from males against those
obtained from females. The plot reproduced in Figure 4 contains item
difficulties for a mathematics test calibrated for groups of students
who studied two different curricula. Analysis of the discrepancies
from the predicted straight line showed how the pattern of learning
produced by the new curriculum was different from that of the old (and
these differences were not in accord with the intentions of the
curriculum development team; Choppin, 1977).

Figure 2A: Cross-calibration of items for a test that 'fits'

Figure 2B: Cross-calibration of items demonstrating guessing

Figure 3: Guide to Interpretation (plot labels: "item calibrations for high ability students", "high discrimination", "low discrimination", "REGION OF ...")

Figure 4: Mathematics Test, Item Difficulty

A more detailed control of the model requires going back to the
original persons-by-items data matrix and estimating the probability
of a correct response for each person/item interaction based on the
estimates of $\alpha$ and $\delta$. When these probabilities are compared to
the observations, a matrix of residuals is generated. This topic has
been well covered in Mead (1975) and Wright and Stone (1979) and will
not be further developed here.

Rasch (1960, p. 174) suggested that the examination of item
triplets might offer an effective control of the model, but this has
not proved to be the case. Recently, however, it has been noticed
that the comparison of the B and B* matrices offers a
concise test of the local independence assumption. The ratio

$$\frac{b_{ji}}{b_{ij}}$$

estimates $e^{(\delta_i - \delta_j)}$ ignoring all other items, whereas

$$\frac{b^*_{ji}}{b^*_{ij}}$$

estimates the same quantity
only through the comparison of i and j to the other items. If there
exists a local contextual effect (e.g., if item 15 is easier than it
would otherwise be because it comes immediately after item 14), then
the b and b* values should show it. This comparison is accomplished
by a $\chi^2$ statistic. The method holds considerable promise since
the assumption of local independence has been strongly attacked as
unrealistic (Goldstein, 1979). A number of studies of achievement
test data in which items are administered in different orders and with
or without other groups of items suggest that the local independence
assumption is often well met in practice, although other evidence
(Tang, 1982) suggests that in a test of reasoning skills such as a
progressive matrices test, the context is extremely important. It
seems probable that the (B, B*) comparison will be used for testing
local independence even when parameter estimation is achieved through
UCON or maximum likelihood PAIR.
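
The two estimates being compared can be computed side by side as in the sketch below. The paper does not give the form of its chi-square statistic, so none is attempted here; the sketch only returns the direct and the indirect log-ratio for one item pair. The function name `direct_and_indirect` and the artificial matrix are assumptions made for the illustration.

```python
import math
from typing import List, Tuple


def direct_and_indirect(b: List[List[float]], i: int, j: int) -> Tuple[float, float]:
    """Return two estimates of delta_i - delta_j for items i and j.

    direct   = log(b[j][i] / b[i][j]), using only the pair itself;
    indirect = log(B*[j][i] / B*[i][j]), where B* is B squared, so the
               pair is compared only through the other items (the
               diagonal of B is zero, which removes the direct terms).
    """
    k = len(b)
    direct = math.log(b[j][i] / b[i][j])
    star_ji = sum(b[j][m] * b[m][i] for m in range(k))
    star_ij = sum(b[i][m] * b[m][j] for m in range(k))
    indirect = math.log(star_ji / star_ij)
    return direct, indirect


# Artificial B following the model exactly: the two estimates coincide.
diffs = [-1.0, 0.0, 0.5, 1.5]
B_fit = [
    [0.0 if r == c else
     100.0 * math.exp(diffs[c]) / (math.exp(diffs[r]) + math.exp(diffs[c]))
     for c in range(4)]
    for r in range(4)
]
direct_est, indirect_est = direct_and_indirect(B_fit, 0, 3)
```

For data that fit the model the two values agree; a marked gap between them for a particular pair is the kind of signal a contextual effect between those two items would produce.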